Multi-party Poisoning through Generalized p-Tampering
In a poisoning attack against a learning algorithm, an adversary tampers with
a fraction of the training data T with the goal of increasing the
classification error of the constructed hypothesis/model over the final test
distribution. In the distributed setting, T might be gathered gradually from
m data providers P_1, ..., P_m who generate and submit their shares of T in
an online way.
In this work, we initiate a formal study of (k,p)-poisoning attacks in which
an adversary controls k ∈ [m] of the parties, and even for each corrupted
party P_i, the adversary submits some poisoned data T'_i on behalf of P_i
that is still "(1-p)-close" to the correct data T_i (e.g., a (1-p) fraction
of T'_i is still honestly generated). For k = m, this model becomes the
traditional notion of poisoning, and for p = 1 it coincides with the standard
notion of corruption in multi-party computation.
We prove that if there is an initial constant error for the generated
hypothesis h, there is always a (k,p)-poisoning attacker who can decrease the
confidence of h (in having a small error), or alternatively increase the
error of h, by Ω(p · k/m). Our attacks can be implemented in polynomial time
given samples from the correct data, and they use no wrong labels if the
original distributions are not noisy.
At a technical level, we prove a general lemma about biasing bounded
functions f(x_1, ..., x_n) ∈ [0, 1] through an attack model in which each
block x_i might be controlled by an adversary with marginal probability p in
an online way. When the probabilities are independent, this coincides with
the model of p-tampering attacks, thus we call our model generalized
p-tampering. We prove the power of such attacks by incorporating ideas from
the context of coin-flipping attacks into the p-tampering model and
generalize the results in both of these areas.
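
To make the tampering model concrete, the following is a minimal Python
sketch (illustrative, not the paper's construction) of an online p-tampering
adversary biasing a bounded function over independent fair bits: with
probability p per block, it substitutes the bit that greedily maximizes a
Monte-Carlo estimate of the conditional expectation. All names and parameter
values here are assumptions for illustration.

import random

def estimate_f(f, prefix, n, rng, samples=150):
    # Monte-Carlo estimate of E[f] when the remaining n - len(prefix)
    # blocks are drawn honestly (fair coins).
    total = 0.0
    for _ in range(samples):
        suffix = [rng.randint(0, 1) for _ in range(n - len(prefix))]
        total += f(prefix + suffix)
    return total / samples

def p_tamper_once(f, n, p, rng):
    # One run against a bounded f: {0,1}^n -> [0,1]. Each block is
    # independently tamperable with probability p; a tampered block is
    # set greedily to maximize the estimated conditional expectation.
    prefix = []
    for _ in range(n):
        if rng.random() < p:  # adversary controls this block
            bit = max((0, 1), key=lambda b: estimate_f(f, prefix + [b], n, rng))
        else:                 # honest provider submits a fair coin
            bit = rng.randint(0, 1)
        prefix.append(bit)
    return f(prefix)

# Example: bias the majority of 15 fair coins with tampering rate p = 0.2.
maj = lambda xs: float(sum(xs) > len(xs) // 2)
rng = random.Random(0)
runs = 300
avg = sum(p_tamper_once(maj, 15, 0.2, rng) for _ in range(runs)) / runs
print(f"E[maj] under 0.2-tampering ~ {avg:.2f} (honest baseline: 0.5)")

Under these assumptions, even this naive greedy rule should push E[maj]
noticeably above the honest baseline of 1/2, which is the kind of bias the
lemma quantifies.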
Bounding Training Data Reconstruction in DP-SGD
Differentially private training offers protection that is usually interpreted
as a guarantee against membership inference attacks. By proxy, this
guarantee extends to other threats like reconstruction attacks attempting to
extract complete training examples. Recent works provide evidence that if one
does not need to protect against membership attacks but instead only wants to
protect against training data reconstruction, then utility of private models
can be improved because less noise is required to protect against these more
ambitious attacks. We investigate this further in the context of DP-SGD, a
standard algorithm for private deep learning, and provide an upper bound on the
success of any reconstruction attack against DP-SGD together with an attack
that empirically matches the predictions of our bound. Together, these two
results open the door to fine-grained investigations on how to set the privacy
parameters of DP-SGD in practice to protect against reconstruction attacks.
Finally, we use our methods to demonstrate that different settings of the
DP-SGD parameters leading to the same DP guarantees can result in significantly
different success rates for reconstruction, indicating that the DP guarantee
alone might not be a good proxy for controlling the protection against
reconstruction attacks.
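
To ground the parameters this result concerns, here is a minimal NumPy sketch
of a single DP-SGD step: per-example gradient clipping followed by Gaussian
noising of the batch average. The function name and the default values of lr,
clip_norm, and noise_multiplier are illustrative assumptions, not the paper's
settings; in practice these are chosen together with a privacy accountant
that maps (noise multiplier, sampling rate, number of steps) to an (ε, δ)
guarantee.

import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.1, clip_norm=1.0,
                noise_multiplier=1.0, rng=np.random.default_rng(0)):
    # Clip each per-example gradient to L2 norm at most clip_norm.
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    batch_size = len(clipped)
    # Noise with std sigma * C on the summed gradients is equivalent to
    # std sigma * C / B on the batch mean.
    noise = rng.normal(0.0, noise_multiplier * clip_norm / batch_size,
                       size=params.shape)
    return params - lr * (np.mean(clipped, axis=0) + noise)

# Toy usage: one update on a 4-dimensional parameter vector.
rng = np.random.default_rng(1)
params = np.zeros(4)
grads = [rng.normal(size=4) for _ in range(8)]
params = dp_sgd_step(params, grads, noise_multiplier=1.1)

The abstract's final observation corresponds to the fact that several
(clip_norm, noise_multiplier, batch size, steps) combinations can satisfy the
same (ε, δ) while leaving different amounts of room for reconstruction.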
Just Rotate it: Deploying Backdoor Attacks via Rotation Transformation
Recent works have demonstrated that deep learning models are vulnerable to
backdoor poisoning attacks, in which the attack instills spurious correlations
with external trigger patterns or objects (e.g., stickers, sunglasses). We
find that such external trigger signals are unnecessary, as highly effective
backdoors can be easily inserted using rotation-based image transformation. Our
method constructs the poisoned dataset by rotating a limited number of objects
and labeling them incorrectly; once trained on it, the victim's model makes
undesirable predictions at run-time inference. Through comprehensive empirical
studies on image classification and object detection tasks, we show that the
attack achieves a high success rate while maintaining performance on clean
inputs. Furthermore, we evaluate standard data augmentation techniques
and four different backdoor defenses against our attack and find that none of
them can serve as a consistent mitigation approach. Our attack can be easily
deployed in the real world since it only requires rotating the object, as we
show in both image classification and object detection applications. Overall,
our work highlights a new, simple, physically realizable, and highly effective
vector for backdoor attacks. Our video demo is available at
https://youtu.be/6JIF8wnX34M.
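
Since the attack is purely a data transformation, the poisoning step admits a
minimal torchvision-style sketch; the function name, rotation angle, poison
rate, and target label below are illustrative assumptions rather than the
paper's exact configuration.

import random
from torchvision.transforms import functional as TF

def poison_with_rotation(dataset, target_label, angle=90.0,
                         poison_rate=0.01, seed=0):
    # Rotate a poison_rate fraction of the samples by `angle` degrees and
    # relabel them to target_label; leave every other sample untouched.
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(dataset)),
                            int(poison_rate * len(dataset))))
    poisoned = []
    for i, (img, label) in enumerate(dataset):
        if i in chosen:
            poisoned.append((TF.rotate(img, angle), target_label))
        else:
            poisoned.append((img, label))
    return poisoned

At test time, the backdoor would be triggered by presenting the object
rotated by roughly the same angle, which is what makes the attack physically
realizable.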